DETAILED ACTION
Remarks 

1. The Amendment filed on March 13, 2006 has been received and entered. Claims 1-24 are
pending. 

2. Amendments to the claims have overcome the 35 USC 101 and 35 USC 112, second
paragraph, rejections.

Response to Amendment 

3. The Declaration filed on March 13, 2006 under 37 CFR 1.131 is sufficient to overcome 
the Cristianini et al. article reference. 

EXAMINER'S AMENDMENT 

4. An examiner's amendment to the record appears below. Should the changes and/or 
additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 
1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the 
payment of the issue fee. 

Authorization for this examiner's amendment was given in a telephone interview with 
Mr. Thomas C. Fiala (Attorney of Record) on March 27, 2006. 



Amendments to the Claims: 
5. This listing of the claims will replace all prior versions, and listings, of claims in the
application: 

Listing of Claims: 

6. The application has been amended as follows: 

Claim 1: (Previously Presented) A computer-based method for representing latent 
semantic content of a plurality of documents, each document containing a plurality of terms, the 
method comprising: 

deriving at least one n-tuple term from the plurality of terms; 

forming a two-dimensional matrix, each matrix column c corresponding to a document, 

each matrix row r corresponding to a term occurring in at least one document 
corresponding to a matrix column, 

each matrix element (r, c) related to a number of occurrences of the term
corresponding to the row r in the document corresponding to column c, at least one
matrix element related to the number of occurrences of the at least one n-tuple term occurring in 
the at least one document, and 

performing singular value decomposition and dimensionality reduction on the matrix to 
form a latent semantic indexed vector space and storing the latent semantic indexed vector space 
in an electric form accessible to a user. 
Claim 2: (Currently Amended) The computer-based method as recited in claim 1
further comprising:

identifying an occurrence threshold;

wherein n-tuples that appear less times in the document collection than the occurrence
threshold are not included as elements of the matrix.

Claim 3: (Currently Amended) The computer-based method as recited in claim 2
wherein the occurrence threshold is two.

Claim 4: (Currently Amended) The computer-based method as recited in claim
1 wherein deriving at least one n-tuple term further comprises: 

creating the at least one n-tuple term from n consecutive verbatim terms. 

Claim 5: (Previously Presented) A computer-based method for determining conceptual 
similarity between a subject document and at least one of a plurality of reference documents, 
each reference document containing a plurality of terms, the method comprising: 

deriving at least one n-tuple term from the plurality of terms; 

forming a plurality of two-dimensional matrices wherein, for each matrix: 

each matrix column c corresponds to a document, wherein one column corresponds to the 
subject document and the remaining columns correspond to the reference documents; 

each matrix row r corresponds to a term occurring in at least one of the subject document 
or the reference documents, 
each matrix element (r, c) represents a number of occurrences of the term corresponding
to r in the document corresponding to c; 

performing singular value decomposition and dimensionality reduction on the plurality of 
formed matrices, to form a plurality of latent semantic indexed vector spaces, 

the plurality of latent semantic indexed vector spaces including at least one space formed 
from a matrix including at least one element corresponding to the number of occurrences of at 
least one n-tuple term in at least one document, 

determining at least one composite similarity measure between the subject document and 
the at least one reference document as a function of a weighted similarity measure of the subject 
document to the at least one reference document in each of the plurality of indexed vector spaces 
and storing the at least one composite similarity measure in an electric form accessible to a user. 

Claim 6: (Previously Presented) The method as recited in claim 5 wherein the at least one 
composite similarity measure comprises weighing similarity measures from vector spaces 
comprising greater numbers of n-tuples greater than similarity measures from vector spaces 
comprising lesser number of n-tuples. 

Claims 7-13 (canceled) 

Claim 14: (Previously Presented) A computer-based method for characterizing results of 
a query comprising: 
automatically identifying n-tuples included in a collection of documents based on an
analysis of the collection of documents, wherein each document in the collection of documents 
contain a plurality of terms; 

forming a latent semantic indexed vector space based on (i) the documents in the 
collection of documents, (ii) the plurality of terms, and (iii) the automatically identified n-tuples; 

querying the latent semantic indexed vector space with a query having at least one term;

ranking results of the querying step as a function of at least a frequency of occurrence of 
the at least one term, thereby generating a characterization of the results; and 

storing the characterization in an electronic form accessible to a user. 

Claim 15: (Original) The method as recited in claim 14 wherein at least one term used in 
ranking is a query term. 

Claim 16: (Original) The method as recited in claim 15 wherein the at least one query 
term used in ranking is a generalized entity. 

Claim 17: (Original) The method as recited in claim 14 wherein the at least one term used 
in ranking is a generalized entity. 



Claims 18-21 (canceled) 
Claim 22: (Previously Presented) A computer-based method for representing latent
semantic content of a plurality of documents, each document containing a plurality of verbatim 
terms, the method comprising: 

deriving at least one expansion phrase from the verbatim terms, each expansion phrase 
comprising terms; 

replacing at least one occurrence of a verbatim term having an expansion phrase with the 
expansion phrase corresponding to that verbatim term; 
forming a two-dimensional matrix, 
each matrix column c corresponding to a document; 
each matrix row r corresponding to a term;

each matrix element (r, c) representing a number of occurrences of the term 
corresponding to r in the document corresponding to c; 

at least one matrix element corresponding to the number of occurrences of at least
one term occurring in the at least one expansion phrase, and

performing singular value decomposition and dimensionality reduction on the matrix to 
form a latent semantic indexed vector space and storing the latent semantic indexed vector space 
in an electronic form accessible to a user. 

Claim 23: (Previously Presented) A computer-based method for representing the latent 
semantic content of a plurality of documents, each document containing a plurality of terms, the 
method comprising: 

identifying at least one idiom among the documents, 
each idiom containing at least one idiom term;

forming a two-dimensional matrix, 

each matrix column corresponding to a document; 

each matrix row corresponding to a term occurring in at least one document represented 
by a row; 

each matrix element representing a number of occurrences of the term corresponding to 
the element's row in the document corresponding to element's column; 

at least one occurrence of at least one idiom term being excluded from the number of 
occurrences corresponding to that term in the matrix, 

performing singular value decomposition and dimensionality reduction on the matrix to 
form a reduced matrix and storing the reduced matrix in an electronic form accessible to a user. 

Claim 24: (Previously Presented) A computer-based method for representing the latent 
semantic content of a plurality of documents, each document containing a plurality of terms, the 
method comprising: 

identifying at least one idiom among the documents, 

each idiom containing at least one idiom term; 

replacing at least one identified idiom with a corresponding idiom elaboration, each 
elaboration comprising at least one elaboration term, 
forming a two-dimensional matrix, 
each matrix column corresponding to a document; 
each matrix row corresponding to a term; 
each matrix element representing a number of occurrences of the term corresponding to
the element's row in the document corresponding to element's column, 

at least one matrix element corresponding to the number of occurrences of an elaboration 
term in a document corresponding to a matrix column; 

performing singular value decomposition and dimensionality reduction on the 
matrix to form a reduced matrix and storing the reduced matrix in an electronic form accessible 
to a user. 

Allowance 

7. Claims 1-6, 14-17, and 22-24 are allowed over the prior art made of record. 

8. Any comments considered necessary by applicant must be submitted no later than the 
payment of the issue fee and, to avoid processing delays, should preferably accompany the issue 
fee. Such submissions should be clearly labeled "Comments on Statement of Reasons for 
Allowance." 

Conclusion 

9. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Neveen Abel-Jalil whose telephone number is 571-272-4074. 
The examiner can normally be reached on 8:30AM-5:30PM EST. 



Application/Control Number: 09/683,263 Page 10 

Art Unit: 2165 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Jeffrey A. Gaffin, can be reached at 571-272-4146. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application
Information Retrieval (PAIR) system. Status information for published applications may be 
obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



Neveen Abel-Jalil 
March 27, 2006 



